Fundamental limits for information retrieval

نویسندگان

  • John T. Coffey
  • Matthew Klimesh
چکیده

The fundamental limits of performance for a general model of information retrieval from databases are studied. In the scenarios considered a large quantity of information is to be stored on some physical storage device. Requests for information are modeled as a randomly generated sequence with a known distribution. The requests are assumed to be “context-dependent”, i.e., to vary according to the sequence of previous requests. The state of the physical storage device is also assumed to depend on the history of previous requests. In general the logical structure of the information to be stored does not match the physical structure of the storage device, and consequently there are nontrivial limits on the minimum achievable average access times, where the average is over the possible sequences of user requests. The paper applies basic information-theoretic methods to establish these limits and demonstrates constructive procedures that approach them, for a wide class of systems. Allowing redundancy greatly lowers the achievable access times, even when the amount added is an arbitrarily small fraction of the total amount of information in the database. The achievable limits both with and without redundancy are computed; in the case where redundancy is allowed the limits essentially coincide with lower limits for more general storage systems.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

ارزیابی نرم‌افزارهای جامع کتابداری تحت وب پارس‌آذرخش، نوسا و نمایه در بازیابی اطلاعات

Purpose: Since information storage and retrieval is among fundamental tasks of libraries and information dissemination centers, finding proper software in this field is of high significance. The purpose of the present study is to evaluate web-based librarianship software Pars azarakhsh, Nosa, and Namayeh in information retrieval. Methodology: The present study is an applied one. Its methodolog...

متن کامل

Improved Skips for Faster Postings List Intersection

Information retrieval can be achieved through computerized processes by generating a list of relevant responses to a query. The document processor, matching function and query analyzer are the main components of an information retrieval system. Document retrieval system is fundamentally based on: Boolean, vector-space, probabilistic, and language models. In this paper, a new methodology for mat...

متن کامل

Improved Skips for Faster Postings List Intersection

Information retrieval can be achieved through computerized processes by generating a list of relevant responses to a query. The document processor, matching function and query analyzer are the main components of an information retrieval system. Document retrieval system is fundamentally based on: Boolean, vector-space, probabilistic, and language models. In this paper, a new methodology for mat...

متن کامل

Cache-Aided Private Information Retrieval with Partially Known Uncoded Prefetching: Fundamental Limits

We consider the problem of private information retrieval (PIR) from N non-colluding and replicated databases, when the user is equipped with a cache that holds an uncoded fraction r of the symbols from each of the K stored messages in the databases. This model operates in a two-phase scheme, namely, the prefetching phase where the user acquires side information and the retrieval phase where the...

متن کامل

Semiautomatic Image Retrieval Using the High Level Semantic Labels

Content-based image retrieval and text-based image retrieval are two fundamental approaches in the field of image retrieval. The challenges related to each of these approaches, guide the researchers to use combining approaches and semi-automatic retrieval using the user interaction in the retrieval cycle. Hence, in this paper, an image retrieval system is introduced that provided two kind of qu...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • IEEE Trans. Information Theory

دوره 46  شماره 

صفحات  -

تاریخ انتشار 2000